Learning from Noisy Label Distributions

Author: A Culotta
CM Bishop
F Pedregosa
TG Dietterich
Publication date: 10/08/2017

In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of groups and the parameters that represent the noise as hidden variables. The model can be learned with a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in terms of the estimation of the true labels of instances. Comment: Accepted in ICANN201

arXiv.org e-Print Archive

Crossref

Generative Models For Deep Learning with Very Scarce Data

Author: CM Bishop
G Hinton
N Srivastava
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2019

The goal of this paper is to deal with a data scarcity scenario in which deep learning techniques tend to fail. We compare two well-established techniques, Restricted Boltzmann Machines (RBMs) and Variational Auto-encoders (VAEs), as generative models used to enlarge the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms for generating new samples. We show that this methodology improves generalization compared with other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that RBMs are better than VAEs at generating new samples for training a classifier with good generalization capabilities.

arXiv.org e-Print Archive

Crossref

Institutional Effects in a Simple Model of Educational Production

Author: Betts JR
Bishop JH
Costrell RM
Epple D
Hanushek EA
Hoxby CM
JOHN H. BISHOP
LUDGER WÖßMANN
Robertson D
Wößmann L
Publication venue: DigitalCommons@ILR
Publication date: 01/01/2004

This paper presents a model of educational production that tries to make sense of recent evidence on effects of institutional arrangements on student performance. In a simple principal-agent framework, students choose their learning effort to maximize their net benefits, while the government chooses educational spending to maximize its net benefits. In the jointly determined equilibrium, schooling quality is shown to depend on several institutionally determined parameters. The impact on student performance of institutions such as central examinations, centralization versus school autonomy, teachers' influence, parental influence, and competition from private schools is analyzed. Furthermore, the model can rationalize why positive resource effects may be lacking in educational production.

CiteSeerX

Crossref

Research Papers in Economics

DigitalCommons@ILR

eCommons@Cornell

Cascades on a stochastic pulse-coupled network

Author: Bishop S
Wray CM
Publication venue: Nature Publishing Group
Publication date: 12/09/2014

While much recent research has focused on understanding isolated cascades of networks, less attention has been given to dynamical processes on networks exhibiting repeated cascades of opposing influence. An example of this is the dynamic behaviour of financial markets where cascades of buying and selling can occur, even over short timescales. To model these phenomena, a stochastic pulse-coupled oscillator network with upper and lower thresholds is described and analysed. Numerical confirmation of asynchronous and synchronous regimes of the system is presented, along with analytical identification of the fixed point state vector of the asynchronous mean field system. A lower bound for the finite system mean field critical value of network coupling probability is found that separates the asynchronous and synchronous regimes. For the low-dimensional mean field system, a closed-form equation is found for cascade size, in terms of the network coupling probability. Finally, a description of how this model can be applied to interacting agents in a financial market is provided

UCL Discovery

PubMed Central

Representation errors and retrievals in linear and nonlinear data assimilation

Author: Bishop CM
Daley R
Publication venue: 'Wiley'
Publication date: 01/07/2015

This article shows how one can formulate the representation problem starting from Bayes’ theorem. The purpose of this article is to raise awareness of the formal solutions, so that approximations can be placed in a proper context. The representation errors appear in the likelihood, and the different possibilities for the representation of reality in model and observations are discussed, including nonlinear representation probability density functions. Specifically, the assumptions needed in the usual procedure to add a representation error covariance to the error covariance of the observations are discussed, and it is shown that, when several sub-grid observations are present, their mean still has a representation error; so-called ‘superobbing’ does not resolve the issue. Connection is made to the off-line or on-line retrieval problem, providing a new simple proof of the equivalence of assimilating linear retrievals and original observations. Furthermore, it is shown how nonlinear retrievals can be assimilated without loss of information. Finally, we discuss how errors in the observation operator model can be treated consistently in the Bayesian framework, connecting to previous work in this area.

Central Archive at the University of Reading

Crossref

Bridging the Gap between Probabilistic and Deterministic Models: A Simulation Study on a Variational Bayes Predictive Coding Recurrent Neural Network Model

Author: CM Bishop
J Tani
K Friston
S Cruys Van de
S Murata
Y Yamashita
Publication date: 15/09/2017

The current paper proposes a novel variational Bayes predictive coding RNN model, which can learn to generate fluctuated temporal patterns from exemplars. The model learns to maximize the lower bound of the weighted sum of the regularization and reconstruction error terms. We examined how this weighting can affect the development of different types of information processing while learning fluctuated temporal patterns. Simulation results show that strong weighting of the reconstruction term causes the development of deterministic chaos for imitating the randomness observed in target sequences, while strong weighting of the regularization term causes the development of stochastic dynamics imitating probabilistic processes observed in targets. Moreover, results indicate that the most generalized learning emerges between these two extremes. The paper concludes with implications in terms of the underlying neuronal mechanisms for autism spectrum disorder and for free action. Comment: This paper is accepted at the 24th International Conference On Neural Information Processing (ICONIP 2017). The previous submission to arXiv is replaced by this version because there was an error in Equation

arXiv.org e-Print Archive

Crossref

Analysis of dropout learning regarded as ensemble learning

Author: A Krizhevsky
CM Bishop
D Saad
GE Hinton
K Hara
M Biehl
S Wager
Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/06/2017

Deep learning is the state of the art in fields such as visual object recognition and speech recognition. It uses a large number of layers and a huge number of units and connections, so overfitting is a serious problem. Dropout learning has been proposed to avoid this problem. Dropout learning neglects some inputs and hidden units in the learning process with a probability, p, and then the neglected inputs and hidden units are combined with the learned network to express the final output. We find that the process of combining the neglected hidden units with the learned network can be regarded as ensemble learning, so we analyze dropout learning from this point of view. Comment: 9 pages, 8 figures, submitted to Conferenc

arXiv.org e-Print Archive

Crossref

Towards Analyzing Semantic Robustness of Deep Neural Networks

Author: A Fawzi
AH Stroud
AN Bhagoji
CM Bishop
G An
RG Ródenas
T Dreossi
Y Grandvalet
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/09/2020

Despite the impressive performance of Deep Neural Networks (DNNs) on various vision tasks, they still exhibit erroneous high sensitivity toward semantic primitives (e.g. object pose). We propose a theoretically grounded analysis for DNN robustness in the semantic space. We qualitatively analyze different DNNs' semantic robustness by visualizing the DNN global behavior as semantic maps and observe interesting behavior of some DNNs. Since generating these semantic maps does not scale well with the dimensionality of the semantic space, we develop a bottom-up approach to detect robust regions of DNNs. To achieve this, we formalize the problem of finding robust semantic regions of the network as optimizing integral bounds and we develop expressions for update directions of the region bounds. We use our developed formulations to quantitatively evaluate the semantic robustness of different popular network architectures. We show through extensive experimentation that several networks, while trained on the same dataset and enjoying comparable accuracy, do not necessarily perform similarly in semantic robustness. For example, InceptionV3 is more accurate despite being less semantically robust than ResNet50. We hope that this tool will serve as a milestone towards understanding the semantic robustness of DNNs. Comment: Presented at European Conference on Computer Vision (ECCV 2020) Workshop on Adversarial Robustness in the Real World (https://eccv20-adv-workshop.github.io/) [best paper award]. The code is available at https://github.com/ajhamdi/semantic-robustnes

arXiv.org e-Print Archive

Crossref

Signal processing for molecular and cellular biological physics: an emerging field

Author: Arce GR
Bishop CM
Nelson PC
Silverman BW
Publication venue: 'The Royal Society'
Publication date: 13/02/2013

Recent advances in our ability to watch the molecular and cellular processes of life in action (such as atomic force microscopy, optical tweezers and Förster fluorescence resonance energy transfer) raise challenges for digital signal processing (DSP) of the resulting experimental data. This article explores the unique properties of such biophysical time series that set them apart from other signals, such as the prevalence of abrupt jumps and steps, multi-modal distributions and autocorrelated noise. It exposes the problems with classical linear DSP algorithms applied to this kind of data, and describes new nonlinear and non-Gaussian algorithms that are able to extract information that is of direct relevance to biological physicists. It is argued that these new methods applied in this context typify the nascent field of biophysical DSP. Practical experimental examples are supplied.

Crossref

PubMed Central

Aston Publications Explorer

Outlier detection with partial information: Application to emergency mapping

Author: AP Dempster
CM Bishop
D D’Alimonte
Dan Cornford
Davide D’Alimonte
G Dubois
IT Nabney
M Markou
S Roberts
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2008

This paper addresses the problem of novelty detection in the case that the observed data is a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the 'probability density function', 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the 'prior' 'background' probability), the 'pdf' and the 'prior' probabilities of all other components are all assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, and afterwards data are Bayes optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in the emergency situation which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007

Crossref

Aston Publications Explorer